Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 256782 |
| Missing cells | 40320 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 27.4 MiB |
| Average record size in memory | 112.0 B |
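The derived figures in this table can be recomputed from the raw counts. The following is a minimal sketch that uses only the numbers reported above; the variable names are illustrative and are not part of the report.

```python
# Values copied from the dataset statistics table.
n_variables = 14
n_observations = 256782
missing_cells = 40320
total_size_mib = 27.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory in bytes divided by the number of rows.
avg_record_bytes = total_size_mib * 1024 ** 2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal, the recomputed record size can differ slightly from the reported 112.0 B.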
Variable types
| NUM | 9 |
|---|---|
| CAT | 3 |
| BOOL | 1 |
| DATE | 1 |
VERSIE has constant value "1" | Constant |
DATUM_BESTAND has constant value "2020-10-11" | Constant |
PEILDATUM has constant value "2020-10-01" | Constant |
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1766 distinct values | High cardinality |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
GEMIDDELDE_VERKOOPPRIJS has 40320 (15.7%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.28466669) | Skewed |
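The skewness γ1 in the last warning is the third standardized moment. The sketch below computes the population form on toy data (the two sample lists are invented for illustration and are not taken from the dataset). pandas' `Series.skew` applies a bias correction, so its result differs slightly on small samples.

```python
def skewness(values):
    """Population skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Toy data (hypothetical): a symmetric sample and a long right tail.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 2, 3, 5, 13, 100, 1675]

print(skewness(symmetric))  # zero for a symmetric sample
print(skewness(long_tail))  # strongly positive for a long right tail
```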
Reproduction
| Analysis started | 2020-10-25 20:13:08.808320 |
|---|---|
| Analysis finished | 2020-10-25 20:13:38.764423 |
| Duration | 29.96 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
VERSIE
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 256782 | 100.0% |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2020-10-11 | 256782 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Value | Count | Frequency (%) |
|---|---|---|
| 2020-10-01 | 256782 | 100.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
JAAR
Date
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2020-01-01 00:00:00 |
Histogram with fixed size bins (bins=9)
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 421.9037316 |
|---|---|
| Minimum | 301 |
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 921.5860959 |
|---|---|
| Coefficient of variation (CV) | 2.184351611 |
| Kurtosis | 71.13989567 |
| Mean | 421.9037316 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.545382675 |
| Sum | 108337284 |
| Variance | 849320.9321 |
| Monotonicity | Not monotonic |
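Several entries in the quantile and descriptive statistics tables are derived from the others. As a rough check, the sketch below recomputes them for BEHANDELEND_SPECIALISME_CD from the values reported above; the variable names are illustrative.

```python
# Values copied from the BEHANDELEND_SPECIALISME_CD tables.
n_observations = 256782
mean = 421.9037316
std = 921.5860959
minimum, maximum = 301, 8418
q1, q3 = 305, 322

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_observations     # Sum

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.2f}, sum = {total:.0f}")
```

The recomputed values agree with the reported range (8117), IQR (17), CV, variance and sum up to rounding of the inputs.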
Histogram with fixed size bins (bins=27)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 305 | 36318 | 14.1% | |
| 313 | 33317 | 13.0% | |
| 303 | 29478 | 11.5% | |
| 330 | 20570 | 8.0% | |
| 316 | 17550 | 6.8% | |
| 308 | 13193 | 5.1% | |
| 306 | 10716 | 4.2% | |
| 324 | 10684 | 4.2% | |
| 301 | 10404 | 4.1% | |
| 304 | 8391 | 3.3% | |
| Other values (17) | 66161 | 25.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 301 | 10404 | 4.1% | |
| 302 | 5585 | 2.2% | |
| 303 | 29478 | 11.5% | |
| 304 | 8391 | 3.3% | |
| 305 | 36318 | 14.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8418 | 3359 | 1.3% | |
| 1900 | 168 | 0.1% | |
| 390 | 663 | 0.3% | |
| 389 | 2777 | 1.1% | |
| 362 | 3890 | 1.5% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1766 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 101 | 1074 | 0.4% |
| 402 | 1052 | 0.4% |
| 301 | 1025 | 0.4% |
| 403 | 1021 | 0.4% |
| 203 | 966 | 0.4% |
| 201 | 959 | 0.4% |
| 401 | 862 | 0.3% |
| 404 | 850 | 0.3% |
| 802 | 841 | 0.3% |
| 409 | 833 | 0.3% |
| Other values (1756) | 247299 | 96.3% |
Frequencies of value counts
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.349814239 |
| Min length | 2 |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct | 5891 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 440748534.1 |
|---|---|
| Minimum | 10501002 |
| Maximum | 998418081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999036 |
| Q1 | 99799032 |
| median | 149599026 |
| Q3 | 990004004 |
| 95-th percentile | 990416049.9 |
| Maximum | 998418081 |
| Range | 987917079 |
| Interquartile range (IQR) | 890204972 |
Descriptive statistics
| Standard deviation | 429110698.5 |
|---|---|
| Coefficient of variation (CV) | 0.9735952937 |
| Kurtosis | -1.737371851 |
| Mean | 440748534.1 |
| Median Absolute Deviation (MAD) | 119600023 |
| Skewness | 0.4676016897 |
| Sum | 1.131762901e+14 |
| Variance | 1.841359916e+17 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 990004009 | 1893 | 0.7% | |
| 990004007 | 1859 | 0.7% | |
| 990003004 | 1820 | 0.7% | |
| 990004006 | 1477 | 0.6% | |
| 990356076 | 1326 | 0.5% | |
| 990356073 | 1227 | 0.5% | |
| 990003007 | 1177 | 0.5% | |
| 131999228 | 1136 | 0.4% | |
| 131999164 | 1122 | 0.4% | |
| 199299013 | 1073 | 0.4% | |
| Other values (5881) | 242672 | 94.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10501002 | 6 | < 0.1% | |
| 10501003 | 9 | < 0.1% | |
| 10501004 | 9 | < 0.1% | |
| 10501005 | 9 | < 0.1% | |
| 10501007 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 998418081 | 123 | < 0.1% | |
| 998418080 | 105 | < 0.1% | |
| 998418079 | 29 | < 0.1% | |
| 998418077 | 6 | < 0.1% | |
| 998418076 | 6 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
| Distinct | 8772 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 497.7230764 |
|---|---|
| Minimum | 1 |
| Maximum | 154283 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 13 |
| Q3 | 100 |
| 95-th percentile | 1675 |
| Maximum | 154283 |
| Range | 154282 |
| Interquartile range (IQR) | 97 |
Descriptive statistics
| Standard deviation | 3099.267383 |
|---|---|
| Coefficient of variation (CV) | 6.226891077 |
| Kurtosis | 399.6134163 |
| Mean | 497.7230764 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 16.68000941 |
| Sum | 127806327 |
| Variance | 9605458.312 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 42622 | 16.6% | |
| 2 | 20949 | 8.2% | |
| 3 | 13582 | 5.3% | |
| 4 | 10163 | 4.0% | |
| 5 | 7827 | 3.0% | |
| 6 | 6544 | 2.5% | |
| 7 | 5471 | 2.1% | |
| 8 | 4632 | 1.8% | |
| 9 | 4272 | 1.7% | |
| 10 | 3724 | 1.5% | |
| Other values (8762) | 136996 | 53.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 42622 | 16.6% | |
| 2 | 20949 | 8.2% | |
| 3 | 13582 | 5.3% | |
| 4 | 10163 | 4.0% | |
| 5 | 7827 | 3.0% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 154283 | 1 | < 0.1% | |
| 153907 | 1 | < 0.1% | |
| 151429 | 1 | < 0.1% | |
| 144723 | 1 | < 0.1% | |
| 113196 | 1 | < 0.1% |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
| Distinct | 9409 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 580.6849273 |
|---|---|
| Minimum | 1 |
| Maximum | 239907 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 108 |
| 95-th percentile | 1886 |
| Maximum | 239907 |
| Range | 239906 |
| Interquartile range (IQR) | 105 |
Descriptive statistics
| Standard deviation | 3917.510279 |
|---|---|
| Coefficient of variation (CV) | 6.746361227 |
| Kurtosis | 725.1548283 |
| Mean | 580.6849273 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 21.28466669 |
| Sum | 149109437 |
| Variance | 15346886.78 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41151 | 16.0% | |
| 2 | 20600 | 8.0% | |
| 3 | 13438 | 5.2% | |
| 4 | 10006 | 3.9% | |
| 5 | 7746 | 3.0% | |
| 6 | 6531 | 2.5% | |
| 7 | 5472 | 2.1% | |
| 8 | 4585 | 1.8% | |
| 9 | 4184 | 1.6% | |
| 10 | 3771 | 1.5% | |
| Other values (9399) | 139298 | 54.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41151 | 16.0% | |
| 2 | 20600 | 8.0% | |
| 3 | 13438 | 5.2% | |
| 4 | 10006 | 3.9% | |
| 5 | 7746 | 3.0% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 239907 | 1 | < 0.1% | |
| 232484 | 1 | < 0.1% | |
| 231318 | 1 | < 0.1% | |
| 227664 | 1 | < 0.1% | |
| 221364 | 1 | < 0.1% |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
| Distinct | 7695 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7519.273189 |
|---|---|
| Minimum | 1 |
| Maximum | 211638 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 41 |
| Q1 | 393 |
| median | 1655 |
| Q3 | 6171 |
| 95-th percentile | 36005 |
| Maximum | 211638 |
| Range | 211637 |
| Interquartile range (IQR) | 5778 |
Descriptive statistics
| Standard deviation | 17551.14893 |
|---|---|
| Coefficient of variation (CV) | 2.334154976 |
| Kurtosis | 33.53386049 |
| Mean | 7519.273189 |
| Median Absolute Deviation (MAD) | 1506 |
| Skewness | 5.052410862 |
| Sum | 1930814008 |
| Variance | 308042828.7 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21 | 452 | 0.2% | |
| 26 | 385 | 0.1% | |
| 11 | 378 | 0.1% | |
| 25 | 369 | 0.1% | |
| 37 | 368 | 0.1% | |
| 19 | 362 | 0.1% | |
| 6 | 359 | 0.1% | |
| 14 | 358 | 0.1% | |
| 20 | 358 | 0.1% | |
| 17 | 357 | 0.1% | |
| Other values (7685) | 253036 | 98.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 285 | 0.1% | |
| 2 | 310 | 0.1% | |
| 3 | 291 | 0.1% | |
| 4 | 337 | 0.1% | |
| 5 | 285 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 211638 | 25 | < 0.1% | |
| 211089 | 23 | < 0.1% | |
| 209864 | 19 | < 0.1% | |
| 206394 | 17 | < 0.1% | |
| 203709 | 17 | < 0.1% |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
| Distinct | 8509 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10571.53519 |
|---|---|
| Minimum | 1 |
| Maximum | 344226 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 508 |
| median | 2257 |
| Q3 | 8644 |
| 95-th percentile | 50725 |
| Maximum | 344226 |
| Range | 344225 |
| Interquartile range (IQR) | 8136 |
Descriptive statistics
| Standard deviation | 25534.33712 |
|---|---|
| Coefficient of variation (CV) | 2.415385908 |
| Kurtosis | 37.938778 |
| Mean | 10571.53519 |
| Median Absolute Deviation (MAD) | 2070 |
| Skewness | 5.338831566 |
| Sum | 2714579949 |
| Variance | 652002372.2 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 38 | 330 | 0.1% | |
| 24 | 329 | 0.1% | |
| 82 | 325 | 0.1% | |
| 11 | 323 | 0.1% | |
| 39 | 317 | 0.1% | |
| 25 | 311 | 0.1% | |
| 13 | 306 | 0.1% | |
| 20 | 295 | 0.1% | |
| 6 | 294 | 0.1% | |
| 31 | 293 | 0.1% | |
| Other values (8499) | 253659 | 98.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 235 | 0.1% | |
| 2 | 258 | 0.1% | |
| 3 | 260 | 0.1% | |
| 4 | 261 | 0.1% | |
| 5 | 264 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 344226 | 25 | < 0.1% | |
| 340616 | 19 | < 0.1% | |
| 334458 | 23 | < 0.1% | |
| 323774 | 20 | < 0.1% | |
| 302856 | 17 | < 0.1% |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
| Distinct | 242 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 658474.8873 |
|---|---|
| Minimum | 488 |
| Maximum | 1489520 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 488 |
|---|---|
| 5-th percentile | 43687 |
| Q1 | 248773 |
| median | 747052 |
| Q3 | 1006376 |
| 95-th percentile | 1345302 |
| Maximum | 1489520 |
| Range | 1489032 |
| Interquartile range (IQR) | 757603 |
Descriptive statistics
| Standard deviation | 423038.6495 |
|---|---|
| Coefficient of variation (CV) | 0.6424522145 |
| Kurtosis | -1.169335434 |
| Mean | 658474.8873 |
| Median Absolute Deviation (MAD) | 322609 |
| Skewness | 0.07813893512 |
| Sum | 1.690844985e+11 |
| Variance | 1.78961699e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 880969 | 5102 | 2.0% | |
| 874276 | 4355 | 1.7% | |
| 844000 | 4348 | 1.7% | |
| 892885 | 4332 | 1.7% | |
| 872989 | 4269 | 1.7% | |
| 825506 | 4125 | 1.6% | |
| 1084292 | 3891 | 1.5% | |
| 1063992 | 3851 | 1.5% | |
| 1076299 | 3846 | 1.5% | |
| 1039023 | 3810 | 1.5% | |
| Other values (232) | 214853 | 83.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 488 | 51 | < 0.1% | |
| 1294 | 120 | < 0.1% | |
| 1949 | 131 | 0.1% | |
| 2584 | 173 | 0.1% | |
| 5662 | 15 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1489520 | 2976 | 1.2% | |
| 1450641 | 3054 | 1.2% | |
| 1421880 | 3564 | 1.4% | |
| 1345302 | 3543 | 1.4% | |
| 1333146 | 3547 | 1.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
| Distinct | 242 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1039156.657 |
|---|---|
| Minimum | 510 |
| Maximum | 2578269 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 510 |
|---|---|
| 5-th percentile | 47358 |
| Q1 | 355194 |
| median | 989615 |
| Q3 | 1729145 |
| 95-th percentile | 2389572 |
| Maximum | 2578269 |
| Range | 2577759 |
| Interquartile range (IQR) | 1373951 |
Descriptive statistics
| Standard deviation | 729408.9524 |
|---|---|
| Coefficient of variation (CV) | 0.7019239571 |
| Kurtosis | -0.9481651724 |
| Mean | 1039156.657 |
| Median Absolute Deviation (MAD) | 663885 |
| Skewness | 0.3372764253 |
| Sum | 2.668367246e+11 |
| Variance | 5.320374198e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1211808 | 5102 | 2.0% | |
| 1281564 | 4355 | 1.7% | |
| 1216288 | 4348 | 1.7% | |
| 1312623 | 4332 | 1.7% | |
| 1286264 | 4269 | 1.7% | |
| 1209908 | 4125 | 1.6% | |
| 2557655 | 3891 | 1.5% | |
| 2489689 | 3851 | 1.5% | |
| 2578269 | 3846 | 1.5% | |
| 2066523 | 3810 | 1.5% | |
| Other values (232) | 214853 | 83.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 510 | 51 | < 0.1% | |
| 1494 | 120 | < 0.1% | |
| 2231 | 131 | 0.1% | |
| 2924 | 173 | 0.1% | |
| 5711 | 15 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2578269 | 3846 | 1.5% | |
| 2557655 | 3891 | 1.5% | |
| 2489689 | 3851 | 1.5% | |
| 2389572 | 3777 | 1.5% | |
| 2184851 | 3757 | 1.5% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct | 3133 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 40320 |
| Missing (%) | 15.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3480.855162 |
|---|---|
| Minimum | 70 |
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.0 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 460 |
| median | 1215 |
| Q3 | 3980 |
| 95-th percentile | 13220 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3520 |
Descriptive statistics
| Standard deviation | 6565.924627 |
|---|---|
| Coefficient of variation (CV) | 1.886296419 |
| Kurtosis | 174.3252164 |
| Mean | 3480.855162 |
| Median Absolute Deviation (MAD) | 990 |
| Skewness | 7.96652267 |
| Sum | 753472870 |
| Variance | 43111366.2 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 105 | 1830 | 0.7% | |
| 160 | 1780 | 0.7% | |
| 110 | 1461 | 0.6% | |
| 180 | 1361 | 0.5% | |
| 185 | 1268 | 0.5% | |
| 145 | 1260 | 0.5% | |
| 300 | 1240 | 0.5% | |
| 140 | 1203 | 0.5% | |
| 165 | 1181 | 0.5% | |
| 500 | 1148 | 0.4% | |
| Other values (3123) | 202730 | 79.0% | |
| (Missing) | 40320 | 15.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 70 | 226 | 0.1% | |
| 75 | 75 | < 0.1% | |
| 80 | 361 | 0.1% | |
| 85 | 909 | 0.4% | |
| 90 | 502 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 287220 | 8 | < 0.1% | |
| 148910 | 3 | < 0.1% | |
| 142850 | 4 | < 0.1% | |
| 122155 | 4 | < 0.1% | |
| 116765 | 3 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
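The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` is illustrative. The sample values are the AANTAL_PAT_PER_ZPD and AANTAL_SUBTRAJECT_PER_ZPD columns from the first ten rows of this report, one of the pairs flagged as highly correlated.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# First ten rows of the report (see "First rows").
patients = [15, 12596, 58, 97, 6, 10, 10, 4, 69, 2]
subtrajects = [15, 14295, 60, 98, 6, 10, 10, 4, 70, 2]

r = pearson_r(patients, subtrajects)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(patients, [3 * v + 5 for v in subtrajects])
print(r, r_rescaled)
```

Ten rows are far too few to reproduce the report's coefficient; the sample only illustrates the computation and the invariance property.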
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of their standard deviations.
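A minimal sketch of this definition follows: rank both variables (tied values share their average rank), then apply the Pearson formula to the ranks. The function names and the toy data are illustrative assumptions, not part of the report.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(pearson_r(x, y), spearman_rho(x, y))
```

On this toy sample ρ reaches its maximum while r stays below 1, which is the difference the paragraph above describes.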
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
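The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements that variant directly; the function name is illustrative. Statistical libraries often default to the tie-corrected tau-b instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 is a tie and counts as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau_a([1, 2, 3], [1, 2, 3]))
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The quadratic loop over all pairs is fine for illustration but would be slow on the 256782 rows of this dataset.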
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 557 | 70401002 | 15 | 15 | 44291 | 56007 | 1191136 | 1910144 | 1425.0 |
| 1 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 554 | 70401002 | 12596 | 14295 | 200174 | 288400 | 1191136 | 1910144 | 1425.0 |
| 2 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 559 | 70401002 | 58 | 60 | 4445 | 5485 | 1191136 | 1910144 | 1425.0 |
| 3 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 554 | 70401003 | 97 | 98 | 200174 | 288400 | 1191136 | 1910144 | 625.0 |
| 4 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 559 | 70401003 | 6 | 6 | 4445 | 5485 | 1191136 | 1910144 | 625.0 |
| 5 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 557 | 70401003 | 10 | 10 | 44291 | 56007 | 1191136 | 1910144 | 625.0 |
| 6 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 559 | 70401004 | 10 | 10 | 4445 | 5485 | 1191136 | 1910144 | NaN |
| 7 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 557 | 70401004 | 4 | 4 | 44291 | 56007 | 1191136 | 1910144 | NaN |
| 8 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 554 | 70401004 | 69 | 70 | 200174 | 288400 | 1191136 | 1910144 | NaN |
| 9 | 1.0 | 2020-10-11 | 2020-10-01 | 2016-01-01 | 301 | 554 | 70401006 | 2 | 2 | 200174 | 288400 | 1191136 | 1910144 | NaN |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 256772 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3504 | 991630064 | 4 | 5 | 17 | 22 | 445889 | 758280 | NaN |
| 256773 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3504 | 991630065 | 10 | 11 | 17 | 22 | 445889 | 758280 | 205.0 |
| 256774 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3518 | 991630069 | 1 | 1 | 1787 | 2573 | 445889 | 758280 | NaN |
| 256775 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3520 | 991630069 | 1 | 1 | 3378 | 5109 | 445889 | 758280 | NaN |
| 256776 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3520 | 991630070 | 15 | 16 | 3378 | 5109 | 445889 | 758280 | 2660.0 |
| 256777 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3518 | 991630070 | 1 | 1 | 1787 | 2573 | 445889 | 758280 | 2660.0 |
| 256778 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3517 | 991630070 | 1 | 1 | 1082 | 1552 | 445889 | 758280 | 2660.0 |
| 256779 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3522 | 991630070 | 1 | 1 | 1351 | 1871 | 445889 | 758280 | 2660.0 |
| 256780 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 7610 | 991630070 | 1 | 1 | 114 | 149 | 445889 | 758280 | 2660.0 |
| 256781 | 1.0 | 2020-10-11 | 2020-10-01 | 2018-01-01 | 316 | 3521 | 991630070 | 1 | 1 | 287 | 372 | 445889 | 758280 | 2660.0 |